Goto

Collaborating Authors

 ph0 1


Learning in Observable POMDPs, without Computationally Intractable Oracles

Neural Information Processing Systems

Much of reinforcement learning theory is built on top of oracles that are computationally hard to implement. Specifically for learning near-optimal policies in Partially Observable Markov Decision Processes (POMDPs), existing algorithms either need to make strong assumptions about the model dynamics (e.g.